This script has been generated by Onkar Mulay and Levi Hockey from the Genomics and Machine Learning Lab:
- The data used below are not currently published or publicly available. For more information about the scripts, or to use these datasets beyond this tutorial, please contact o.mulay@uq.edu.au or l.hockey@uq.edu.au
- Code and conda environments used are available from https://github.com/GMLTestLab/qimr-teaching
Table of Contents¶
| Section | Description |
|---|---|
| Cell-Cell-Interactions | Overview of the notebook. |
| stLearn | In-house algorithm for cell-cell interaction analysis of spatial datasets. |
| CellChat | Cell-cell interaction algorithm for single-cell and spatial data. |
| MMCCI | Integration of Cell-Cell Interaction Results. |
| ADVANCED | Advanced material for CCI and other interesting packages. |
Users should only run the "Inference of cell-cell communication network - Run" sections and the MMCCI section. Because computing cell-cell interactions is a time-consuming step, we have already run CCI and saved the results.¶
Cell-Cell-Interactions¶
Multicellular life relies on the coordination of cellular activities, which depend on cell–cell interactions (CCIs) across an organism’s diverse cell types and tissues. Thus, studies on cellular functions increasingly require consideration of the community context of each cell. CCIs leverage diverse molecules, including ions, metabolites, integrins, receptors, junction proteins, structural proteins, ligands and secreted proteins of the extracellular matrix. Some molecules support structural CCIs (for example, cell adhesion molecules), whereas ligands such as hormones, growth factors, chemokines, cytokines and neurotransmitters mediate cell–cell communication.
Cell–cell interactions orchestrate organismal development, homeostasis and single-cell functions. When cells do not properly interact or improperly decode molecular messages, disease ensues. Thus, the identification and quantification of intercellular signalling pathways has become a common analysis performed across diverse disciplines. (Armingol et al., 2020) https://www.nature.com/articles/s41576-020-00292-x
Computational Requirements¶
To perform cell-cell interaction analysis in Python or R, we need:
- Gene expression matrix
- Ligand-Receptor Database
- Spatial Location of cells/spots
- Cell-type annotations
LRI and CCI Score calculation followed by permutation test¶
CellChat¶
CellChat is an R package for cell-cell interaction analysis of single-cell and spatial data.
(Jin et al., 2021)
Steps to calculate interaction scores:
- Identification of differentially expressed signaling genes.
- Calculation of the ensemble average expression, to account for noise in the expression of signalling genes within a given cell group.
- A random walk based network propagation technique to project the gene expression profiles onto a high-confidence experimentally validated protein-protein network from STRINGdb, to calculate LR-communication probability between two cell-types.
Code¶
suppressMessages(library(CellChat))
suppressMessages(library(Seurat))
For CellChat we load the skin-cancer Visium dataset. The four main functions used to create a CellChat object before running the algorithm are:
GetAssayData: To get the RAW gene expression matrix from Seurat object
normalizeData: To normalise the gene expression data
computeCellDistance: Compute cell-cell distance based on the spatial coordinates
createCellChat: Create a cellchat object with required metadata
# Load the RAW spatial or single-cell data
obj <- readRDS("/working/joint_projects/P3903/teaching2024-winter-qimr/data/finalised_files_to_move/Visium_Skin_A2_cellchat.rds")
# Load the deconvolution or cell-annotation results
deconvolution = read.csv("/working/joint_projects/P3903/teaching2024-winter-qimr/data/finalised_files_to_move/visium_decon.csv",row.names=1)
obj <- obj[,rownames(deconvolution)] # Subset the good spots which have cell-type deconvolution results
obj <- AddMetaData(obj, deconvolution)
obj$cell_type <- colnames(deconvolution)[max.col(deconvolution,ties.method="first")]
obj <- obj[,rownames(na.omit(obj@meta.data))]
# Normalise the data
data.input = GetAssayData(obj, slot = "data", assay = "Spatial")
data.input <- normalizeData(data.input) # normalize data matrix
# Set the metadata
meta = data.frame(labels = Seurat::Idents(obj), slices = "slice1", row.names = names(Seurat::Idents(obj)))
meta$slices <- factor(meta$slices)
meta$labels <- factor(obj@meta.data$cell_type)
# Set the spatial boundary constraints - for spatial data
spatial.locs = GetTissueCoordinates(obj, scale = NULL, cols = c("imagerow", "imagecol"))
scalefactors = jsonlite::fromJSON(txt = file.path("/working/joint_projects/P3903/teaching2024-winter-qimr/data/", 'scalefactors_json.json')) # this contains scaling factors
spot.size = 65 # the theoretical spot size (um) in 10X Visium
conversion.factor = spot.size/scalefactors$spot_diameter_fullres
spatial.factors = data.frame(ratio = conversion.factor, tol = spot.size/2)
d.spatial <- computeCellDistance(coordinates = spatial.locs, ratio = spatial.factors$ratio, tol = spatial.factors$tol)
# Create a CellChat object
cellchat <- createCellChat(object = data.input, meta = meta, group.by = "labels",
datatype = "spatial", coordinates = spatial.locs, spatial.factors = spatial.factors)
cellchat
[1] "Create a CellChat object from a data matrix"
Create a CellChat object from spatial transcriptomics data...
Set cell identities for the new CellChat object
The cell groups used for CellChat analysis are Imm_CD8..T.cell, Imm_Endothelial.cell, Imm_Fibroblast, Imm_Macrophage, Imm_Mast.Cells, KC_Cornified, KC_Differentiating, Melanocytes, Pericytes
An object of class CellChat created from a single dataset
18085 genes.
917 cells.
CellChat analysis of spatial data! The input spatial locations are
x_cent y_cent
AACACGTGCATCGCAC-1 26848 21796
AACAGGAAGAGCATAG-1 24386 24722
AACATCTAATGACCGG-1 23863 15211
AACATGCGCAAGTGAG-1 23746 21940
AACCAGAATCAGACGT-1 21184 10809
AACCATCGGAAGCGAC-1 23739 22335
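The pixel-to-micron conversion behind `spatial.factors` can be checked by hand. The sketch below is in plain Python and only illustrates the arithmetic: the value of `spot_diameter_fullres_px` is made up (the real one comes from `scalefactors_json.json`), and the helper name `pixels_to_microns` is hypothetical. The two coordinates are taken from the printed spatial locations, and 250 is the `interaction.range` used later in `computeCommunProb`.

```python
import math

# Hypothetical value for illustration; read the real one from scalefactors_json.json
spot_diameter_fullres_px = 89.4
SPOT_SIZE_UM = 65.0  # theoretical Visium spot size (um), as in the R code

def pixels_to_microns(distance_px, spot_diameter_px, spot_size_um=SPOT_SIZE_UM):
    """Convert a pixel distance to micrometres, using the spot diameter as a ruler."""
    conversion_factor = spot_size_um / spot_diameter_px  # um per pixel
    return distance_px * conversion_factor

# Centres of the first two spots printed by the CellChat object
spot_a = (26848, 21796)
spot_b = (24386, 24722)

dist_px = math.dist(spot_a, spot_b)                       # Euclidean distance in pixels
dist_um = pixels_to_microns(dist_px, spot_diameter_fullres_px)
within_range = dist_um <= 250                             # compare with interaction.range

print(f"{dist_px:.0f} px is about {dist_um:.0f} um; within 250 um: {within_range}")
```

With these assumed numbers the two spots are far outside the interaction range, so a distance-constrained analysis would not let them communicate directly.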
We explore the CellChat LR-pair database and select the required LR-pair category. Here, we select secreted signalling pairs for a quick run.
options(repr.plot.height=7.5,repr.plot.width=15)
CellChatDB <- CellChatDB.human
showDatabaseCategory(CellChatDB)
# Show the structure of the database
dplyr::glimpse(CellChatDB$interaction)
# use a subset of CellChatDB for cell-cell communication analysis
CellChatDB.use <- subsetDB(CellChatDB, search = "Secreted Signaling") # use Secreted Signaling
# set the used database in the object
cellchat@DB <- CellChatDB.use
Rows: 3,249
Columns: 28
$ interaction_name <chr> "TGFB1_TGFBR1_TGFBR2", "TGFB2_TGFBR1_TGFBR2",…
$ pathway_name <chr> "TGFb", "TGFb", "TGFb", "TGFb", "TGFb", "TGFb…
$ ligand <chr> "TGFB1", "TGFB2", "TGFB3", "TGFB1", "TGFB1", …
$ receptor <chr> "TGFbR1_R2", "TGFbR1_R2", "TGFbR1_R2", "ACVR1…
$ agonist <chr> "TGFb agonist", "TGFb agonist", "TGFb agonist…
$ antagonist <chr> "TGFb antagonist", "TGFb antagonist", "TGFb a…
$ co_A_receptor <chr> "", "", "", "", "", "", "", "", "", "", "", "…
$ co_I_receptor <chr> "TGFb inhibition receptor", "TGFb inhibition …
$ evidence <chr> "KEGG: hsa04350", "KEGG: hsa04350", "KEGG: hs…
$ annotation <chr> "Secreted Signaling", "Secreted Signaling", "…
$ interaction_name_2 <chr> "TGFB1 - (TGFBR1+TGFBR2)", "TGFB2 - (TGFBR1+T…
$ is_neurotransmitter <lgl> FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FAL…
$ ligand.symbol <chr> "TGFB1", "TGFB2", "TGFB3", "TGFB1", "TGFB1", …
$ ligand.family <chr> "TGF-beta", "TGF-beta", "TGF-beta", "TGF-beta…
$ ligand.location <chr> "Extracellular matrix, Secreted, Extracellula…
$ ligand.keyword <chr> "Disease variant, Signal, Reference proteome,…
$ ligand.secreted_type <chr> "growth factor", "growth factor", "cytokine;g…
$ ligand.transmembrane <lgl> FALSE, FALSE, TRUE, FALSE, FALSE, FALSE, FALS…
$ receptor.symbol <chr> "TGFBR2, TGFBR1", "TGFBR2, TGFBR1", "TGFBR2, …
$ receptor.family <chr> "Protein kinase superfamily, TKL Ser/Thr prot…
$ receptor.location <chr> "Cell membrane, Secreted, Membrane raft, Cell…
$ receptor.keyword <chr> "Membrane, Secreted, Disulfide bond, Kinase, …
$ receptor.surfaceome_main <chr> "Receptors", "Receptors", "Receptors", "Recep…
$ receptor.surfaceome_sub <chr> "Act.TGFB;Kinase", "Act.TGFB;Kinase", "Act.TG…
$ receptor.adhesome <chr> "", "", "", "", "", "", "", "", "", "", "", "…
$ receptor.secreted_type <chr> "", "", "", "", "", "", "", "", "", "", "", "…
$ receptor.transmembrane <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRU…
$ version <chr> "CellChatDB v1", "CellChatDB v1", "CellChatDB…
Preprocessing the expression data for cell-cell communication analysis¶
To infer the cell state-specific communications, we identify over-expressed ligands or receptors in one cell group and then identify over-expressed ligand-receptor interactions if either ligand or receptor is over-expressed.
cellchat <- subsetData(cellchat)
future::plan("multisession", workers = 4)
cellchat <- identifyOverExpressedGenes(cellchat)
cellchat <- identifyOverExpressedInteractions(cellchat)
The number of highly variable ligand-receptor pairs used for signaling inference is 972
The function projectData projects gene expression data onto a protein-protein interaction (PPI) network. Specifically, a diffusion process smooths each gene's expression values based on its neighbours in a high-confidence, experimentally validated protein-protein network. This is useful when analysing single-cell data with shallow sequencing depth, because the projection reduces the dropout effects of signaling genes, in particular possible zero expression of ligand/receptor subunits. One might be concerned about artifacts introduced by this diffusion process; however, it will only introduce very weak communications.
Users can also skip this step and set raw.use = TRUE in the function computeCommunProb().
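To make the idea of diffusion-based smoothing concrete, here is a minimal sketch of a generic random walk with restart on a toy network, written in plain Python. It is not CellChat's implementation of `projectData`: the gene names, the network, the restart weight `alpha` and the number of iterations are all assumptions chosen for illustration.

```python
def smooth_on_network(expr, adjacency, alpha=0.5, n_iter=50):
    """Random-walk-with-restart smoothing of one cell's expression over a gene network.

    expr:      dict gene -> observed expression
    adjacency: dict gene -> list of neighbouring genes (undirected, every gene has >= 1)
    alpha:     weight kept on the observed value at every step (hypothetical default)
    """
    smoothed = dict(expr)
    for _ in range(n_iter):
        updated = {}
        for gene in expr:
            # Each neighbour shares its current value equally among its own neighbours
            incoming = sum(smoothed[nb] / len(adjacency[nb]) for nb in adjacency[gene])
            updated[gene] = alpha * expr[gene] + (1 - alpha) * incoming
        smoothed = updated
    return smoothed

# Toy example: REC_A is a dropout (zero count) but is connected to an expressed ligand
expr = {"LIG": 2.0, "REC_A": 0.0, "REC_B": 1.0}
adjacency = {"LIG": ["REC_A", "REC_B"], "REC_A": ["LIG"], "REC_B": ["LIG"]}

smoothed = smooth_on_network(expr, adjacency)
print({gene: round(value, 3) for gene, value in smoothed.items()})
```

The dropout gene receives a small non-zero value from its network neighbour while the total signal is preserved, which is why such smoothing can rescue zero-expressed subunits at the cost of adding weak communications.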
# project gene expression data onto PPI (Optional: when running it, USER should set `raw.use = FALSE` in the function `computeCommunProb()` in order to use the projected data)
# cellchat <- projectData(cellchat, PPI.human)
# Run CellChat
cellchat <- computeCommunProb(cellchat, type = "truncatedMean", trim = 0.1,
distance.use = TRUE, interaction.range = 250, scale.distance = 0.01,
contact.dependent = TRUE, contact.range = 100)
cellchat <- filterCommunication(cellchat, min.cells = 8)
cellchat <- aggregateNet(cellchat)
cellchat@dr <- subsetCommunication(cellchat)
# saveRDS(cellchat, file = "cellchat.rds")
truncatedMean is used for calculating the average gene expression per cell group.
[1] ">>> Run CellChat on spatial transcriptomics data using distances as constraints of the computed communication probability <<< [2024-05-30 07:55:21]"
Inference of cell-cell communication network - Run¶
CellChat infers biologically significant cell-cell communication by assigning each interaction a probability value and performing a permutation test. CellChat models the probability of cell-cell communication by integrating gene expression with prior knowledge of the interactions between signaling ligands, receptors and their cofactors, using the law of mass action.
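As a rough illustration of a mass-action style score, the sketch below combines ligand and receptor abundance through Hill functions and summarises multi-subunit complexes by a geometric mean. It is a simplification under stated assumptions: cofactors (agonists, antagonists, co-receptors), spatial distance and cell proportions are ignored, and the function names and the defaults `kh=0.5`, `n=1` are chosen for illustration rather than taken from this notebook.

```python
import math

def hill(x, kh=0.5, n=1):
    """Hill function: saturating response to an average expression value x >= 0."""
    return x ** n / (kh ** n + x ** n)

def geometric_mean(values):
    """Geometric mean of subunit expression; any zero subunit gives zero."""
    return math.prod(values) ** (1.0 / len(values))

def communication_strength(ligand_subunits, receptor_subunits, kh=0.5, n=1):
    """Simplified mass-action score between a sender and a receiver cell group."""
    ligand = geometric_mean(ligand_subunits)
    receptor = geometric_mean(receptor_subunits)
    return hill(ligand, kh, n) * hill(receptor, kh, n)

# Ligand and both receptor subunits expressed at the half-saturation level
strong = communication_strength([0.5], [0.5, 0.5])
# Same ligand, but one receptor subunit is not expressed
silent = communication_strength([0.5], [0.5, 0.0])
print(strong, silent)
```

A missing receptor subunit removes the interaction entirely in this sketch, which shows why dropout in a single subunit can hide a real interaction.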
The number of inferred ligand-receptor pairs clearly depends on the method for calculating the average gene expression per cell group.
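The effect of the averaging method can be seen with a small sketch of a truncated (trimmed) mean in plain Python, assuming the same convention as R's `mean(x, trim = )`, where a fraction `trim` of observations is dropped from each end before averaging. The toy expression vectors are made up.

```python
def truncated_mean(values, trim=0.1):
    """Mean after dropping a fraction `trim` of the observations from each end."""
    ordered = sorted(values)
    k = int(len(ordered) * trim)           # number of observations dropped per tail
    kept = ordered[k:len(ordered) - k]
    return sum(kept) / len(kept)

# 20 cells in a group; the gene is detected in only 2 of them (10%)
rare_gene = [0.0] * 18 + [3.0, 5.0]
# 20 cells in a group; the gene is detected in half of them
common_gene = [0.0] * 10 + [2.0] * 10

print("plain mean, rare gene:      ", sum(rare_gene) / len(rare_gene))
print("truncated mean, rare gene:  ", truncated_mean(rare_gene, trim=0.1))
print("truncated mean, common gene:", truncated_mean(common_gene, trim=0.1))
```

With `trim = 0.1`, a gene detected in 10% of the cells or fewer averages to zero, so interactions driven by such genes are not inferred; a smaller `trim` keeps more of them.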
When analyzing unsorted single-cell transcriptomes, under the assumption that abundant cell populations tend to send collectively stronger signals than rare cell populations, CellChat can also account for the proportion of cells in each group in the probability calculation. To enable this, set population.size = TRUE.
computeCommunProbPathway: Computes the communication probability at the signaling pathway level by summarizing all related ligands/receptors
# Load the saved CellChat object
cellchat <- readRDS("/working/joint_projects/P3903/teaching2024-winter-qimr/data/finalised_files_to_move/CellChat.rds")
cellchat <- computeCommunProbPathway(cellchat)
netVisual_circle: To visualise cell-cell interactions in a network plot
netVisual_heatmap: To visualise cell-cell interactions in a heatmap
groupSize <- as.numeric(table(cellchat@idents))
options(repr.plot.height=5,repr.plot.width=5)
p1<- netVisual_circle(cellchat@net$count, vertex.weight = rowSums(cellchat@net$count),
weight.scale = TRUE, label.edge = FALSE, title.name = "Number of interactions")
p1
options(repr.plot.height=7.5,repr.plot.width=7.5)
netVisual_heatmap(cellchat, measure = "count", color.heatmap = "Blues")
Do heatmap based on a single object
netVisual_aggregate: Use layout = "circle" for a network plot, or layout = "spatial" to show the spatial location of CCIs, here for the $IGF$ pathway.
pathways.show <- c("IGF")
# Circle plot
p2 <- netVisual_aggregate(cellchat, signaling = pathways.show, layout = "circle")
# Spatial plot
options(repr.plot.height=5,repr.plot.width=10)
p3 <- netVisual_aggregate(cellchat, signaling = pathways.show, layout = "spatial",
edge.width.max = 2, vertex.size.max = 1, alpha.image = 0.2, vertex.label.cex = 3.5)
p3 & theme_void()
spatialFeaturePlot: To visualise the LR coexpression on tissue
options(repr.plot.height=5,repr.plot.width=10)
spatialFeaturePlot(cellchat, pairLR.use = "IGF1_IGF1R", point.size = 3.5, do.binary = TRUE, cutoff = 0.05, enriched.only = FALSE, color.heatmap = "Reds", direction = 1) & theme_void()
stLearn¶
stLearn is a Python package for cell-cell interaction analysis of spatial datasets such as Visium and Xenium. stLearn has two test categories:
- For testing neighbourhoods with significant enrichment of LR co-expression (neighbourhood LR analysis - to find spatial locations and significant LR pairs used for interactions).
- For finding cell type combinations with significantly greater interactions than other cell types across the tissue (cell type-specific CCI analysis).
Code¶
In this cell we load the raw spatial dataset and filter out poorly detected genes, i.e. genes detected in fewer than 3 spots/cells. We then apply library-size normalisation per spot/cell.
import numpy as np
import pandas as pd
import stlearn as st
from matplotlib import pyplot as plt
data_dir = "/working/joint_projects/P3903/teaching2024-winter-qimr/data/finalised_files_to_move/"
# Reading the RAW data
visium = st.Read10X(
path=f"{data_dir}/Visium_Raw"
)
visium.var_names_make_unique()
# Filter rarely detected genes and normalise
st.pp.filter_genes(visium, min_cells=3)
st.pp.normalize_total(visium)
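What the two preprocessing calls do can be sketched on a toy count matrix in plain Python. This is a conceptual illustration, not the library code: the gene names and counts are made up, and the default target (the median of the spot totals) is an assumption about how `normalize_total` behaves when no target is given; passing `target_sum=1e6` instead would give counts per million.

```python
import statistics

def filter_genes(counts, genes, min_cells=3):
    """Keep genes detected (count > 0) in at least `min_cells` spots."""
    keep = [j for j in range(len(genes))
            if sum(1 for row in counts if row[j] > 0) >= min_cells]
    kept_genes = [genes[j] for j in keep]
    kept_counts = [[row[j] for j in keep] for row in counts]
    return kept_counts, kept_genes

def normalize_total(counts, target_sum=None):
    """Scale every spot to the same total; default target is the median spot total."""
    totals = [sum(row) for row in counts]
    if target_sum is None:
        target_sum = statistics.median(totals)
    return [[value * target_sum / total if total else 0.0 for value in row]
            for row, total in zip(counts, totals)]

genes = ["GENE_A", "GENE_B", "GENE_C"]      # hypothetical gene names
counts = [[4, 0, 6],                        # rows are spots, columns are genes
          [1, 0, 1],
          [2, 3, 5],
          [0, 0, 8]]

filtered, kept_genes = filter_genes(counts, genes, min_cells=3)
normalised = normalize_total(filtered)
print(kept_genes)
print(normalised)
```

GENE_B is detected in a single spot and is removed; after normalisation every spot has the same total, so differences in sequencing depth no longer dominate the LR scores.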
In this cell we load the deconvolution results from CARD and add it to the anndata for Cell-Cell Interaction pipeline.
# Load the deconvolution results and visualise the format of the dataframe
card_path = f"{data_dir}/../../visium_decon.csv"
spot_mixtures = pd.read_csv(card_path, index_col=0)
spot_mixtures.head()
# Derive the dominant cell type per spot from the deconvolution results
spot_mixtures['predicted_cell_type'] = spot_mixtures.idxmax(axis=1) # Get the cell type with the highest proportion
labels = spot_mixtures.loc[:,'predicted_cell_type'].values.astype(str) # Get the cell type labels
spot_mixtures = spot_mixtures.drop(['predicted_cell_type'], axis=1) # Drop the predicted cell type column
# Subset the visium to only include spots in the spot mixtures
visium = visium[spot_mixtures.index.values].copy() # copy so the subset is a full AnnData object rather than a view
print('Spot mixture order correct?: ',
np.all(spot_mixtures.index.values==visium.obs_names.values)) # Check is in correct order
# NOTE: using the same key in data.obs & data.uns
visium.obs['cell_type'] = labels # Adding the dominant cell type labels per spot
visium.obs['cell_type'] = visium.obs['cell_type'].astype('category')
visium.uns['cell_type'] = spot_mixtures # Adding the cell type scores
# Plot the cell-type labels on tissue
st.pl.cluster_plot(visium, use_label='cell_type')
LR scoring and significance testing st.tl.cci.run¶
We load the LR database, compute the LR interaction scores, and then identify the spots with significant interactions using a random-background test with multiple-testing correction. The LR interaction score within a spot/cell or between neighbouring spots/cells is given by:
Interactions within-spots/cells¶
$$ {LR}_{score}=\frac{{Expr}_{L,S}\times \left[Exp{r}_{R,S} > 0\right]+Exp{r}_{R,S| N}\times \left[Exp{r}_{L,S} > 0\right]}{2} $$
Interactions between-spots/cells¶
$$ {LR}_{score}= \frac{1}{2}\left(mean\big(Exp{r}_{L,S| N}\times \left[Exp{r}_{R,S} \, > \, 0\right]\big) + mean\big(Exp{r}_{R,S| N}\times \left[Exp{r}_{L,S} \, > \, 0\right]\big)\right) $$
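A small plain-Python sketch of the score follows. It rests on one reading of the notation, which is an assumption: $S|N$ is taken to mean the neighbourhood of spot $S$ (including $S$ itself), the mean runs over that neighbourhood, and the within-spot mode is the special case where the neighbourhood contains only $S$. The function name and the toy expression values are hypothetical.

```python
def lr_score(spot, neighbourhood, ligand, receptor):
    """LR score of `spot` given the spots in its neighbourhood.

    ligand / receptor: dict spot -> normalised expression of the ligand / receptor gene.
    """
    mean_ligand = sum(ligand[n] for n in neighbourhood) / len(neighbourhood)
    mean_receptor = sum(receptor[n] for n in neighbourhood) / len(neighbourhood)
    # Indicator terms: the partner gene must be expressed in the spot itself
    receptor_on = 1 if receptor[spot] > 0 else 0
    ligand_on = 1 if ligand[spot] > 0 else 0
    return (mean_ligand * receptor_on + mean_receptor * ligand_on) / 2

ligand = {"s1": 2.0, "s2": 0.0, "s3": 4.0}
receptor = {"s1": 1.0, "s2": 3.0, "s3": 0.0}

within_s1 = lr_score("s1", ["s1"], ligand, receptor)               # within-spot mode
within_s3 = lr_score("s3", ["s3"], ligand, receptor)               # receptor absent
between_s1 = lr_score("s1", ["s1", "s2", "s3"], ligand, receptor)  # spot + neighbours
print(within_s1, within_s3, between_s1)
```

Spot `s3` expresses the ligand strongly but scores zero in within-spot mode because the receptor is absent, whereas `s1` gains from receptor expression in its neighbours in between-spot mode.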
LR significance testing is a robust statistical method to test LR interactions, avoiding biases toward abundant LR pairs and random co-expression of non-interacting gene pairs across neighboring spots/cells. A random background of LR scores for non-interacting genes is established using genes not in the LR database but within the same expression ranges as each ligand and receptor gene in the LR pair being tested. These random genes, representing ligand and receptor expression, are randomly paired to generate non-interacting gene-gene pairs with equivalent expression levels to the LR pair. The LRscore is then calculated for each random pair to create the background distribution per spot and LR pair. A p-value is calculated for each spot and LR pair as the proportion of background scores across k random pairs that had a score greater than the LRscore, using Benjamini/Hochberg correction for multiple testing.
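The last two steps of this test, the empirical p-value and the Benjamini-Hochberg adjustment, can be sketched in plain Python. The sketch assumes the background scores for a spot have already been computed from the randomly paired genes; the selection of expression-matched random genes is omitted, and the function names and numbers are illustrative only.

```python
def empirical_pvalue(observed, background):
    """Proportion of background scores that are greater than the observed LR score."""
    return sum(1 for score in background if score > observed) / len(background)

def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# One spot: observed score versus four background scores from random gene pairs
p_spot = empirical_pvalue(1.5, [0.0, 0.5, 1.0, 2.0])

# Adjusting p-values across four spots for a single LR pair
p_adj = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
print(p_spot, [round(p, 4) for p in p_adj])
```

With only a few background pairs the smallest achievable p-value is coarse, which is one reason a larger number of random pairs is recommended for real analyses.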
# Loading the LR databases available within stlearn (from NATMI)
lrs = st.tl.cci.load_lrs(['connectomeDB2020_lit'], species='human')
print(len(lrs))
# Running the analysis #
st.tl.cci.run(visium, lrs,
min_spots = 20, #Filter out any LR pairs with no scores for less than min_spots
distance=None, # None defaults to spot+immediate neighbours; distance=0 for within-spot mode
n_pairs=1000, # Number of random pairs to generate; low as example, recommend ~10,000
n_cpus=4, # Number of CPUs for parallel. If None, detects & use all available.
)
lr_info = visium.uns['lr_summary'] # A dataframe detailing the LR pairs ranked by number of significant spots.
print('\n', lr_info)
st.tl.cci.adj_pvals(visium, correct_axis='spot',
pval_adj_cutoff=0.05, adj_method='fdr_bh')
# Showing the rankings of the LR from a global and local perspective.
# Ranking based on number of significant hotspots.
st.pl.lr_summary(visium, n_top=500)
st.pl.lr_summary(visium, n_top=50, figsize=(10,3))
st.pl.lr_diagnostics(visium, figsize=(10,2.5))
st.pl.lr_n_spots(visium, n_top=50, figsize=(11, 3),
max_text=100)
st.pl.lr_n_spots(visium, n_top=500, figsize=(11, 3),
max_text=100)
best_lr = visium.uns['lr_summary'].index.values[0] # Just choosing one of the top from lr_summary
stats = ['lr_scores', 'p_vals', 'p_adjs', '-log10(p_adjs)']
fig, axes = plt.subplots(ncols=len(stats), figsize=(16,6))
for i, stat in enumerate(stats):
st.pl.lr_result_plot(visium, use_result=stat, use_lr=best_lr, show_color_bar=False, ax=axes[i])
axes[i].set_title(f'{best_lr} {stat}')
fig, axes = plt.subplots(ncols=2, figsize=(8,6))
st.pl.lr_result_plot(visium, use_result='-log10(p_adjs)', use_lr=best_lr, show_color_bar=False, ax=axes[0])
st.pl.lr_result_plot(visium, use_result='lr_sig_scores', use_lr=best_lr, show_color_bar=False, ax=axes[1])
axes[0].set_title(f'{best_lr} -log10(p_adjs)')
axes[1].set_title(f'{best_lr} lr_sig_scores')
st.pl.lr_plot(visium, best_lr, inner_size_prop=0.1, outer_mode='binary', pt_scale=5,
use_label=None, show_image=True,
sig_spots=False)
st.pl.lr_plot(visium, best_lr, outer_size_prop=1, outer_mode='binary', pt_scale=20,
use_label=None, show_image=True,
sig_spots=True)
Cell type-specific interaction analysis - st.tl.cci.run_cci¶
This test checks whether a pair of cell types interacts via a given LR pair. It accounts for the fact that more than one cell type may be present at a given spot. The cell type interaction analysis uses the significant spot/bin/cell outputs from the spot LR analysis above. For each LR pair and spot, the count matrix $CCI_{LR}$ of shape $n_c \times n_c$ is calculated, where $n_c$ is the number of predicted cell types. Each row in $CCI_{LR}$ corresponds to the signal emitting cell types (ligand expressing; sender), and each column to the signal detecting cell types (receptor expressing; receiver).
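One simplified way to fill such a sender-by-receiver count matrix for a single LR pair is sketched below in plain Python. The counting rule is an assumption made for illustration and may differ from what `st.tl.cci.run_cci` does internally: for every significant spot expressing the ligand, each neighbouring spot expressing the receptor adds one count for every (cell type in the sender spot, cell type in the receiver spot) combination. All names and toy data are hypothetical.

```python
def count_cci(cell_types, spot_types, neighbours, ligand_spots, receptor_spots, sig_spots):
    """Build a sender x receiver count matrix as a nested dict."""
    counts = {sender: {receiver: 0 for receiver in cell_types} for sender in cell_types}
    for spot in sig_spots:
        if spot not in ligand_spots:
            continue                      # the sender spot must express the ligand
        for nb in neighbours[spot]:
            if nb not in receptor_spots:
                continue                  # the receiver spot must express the receptor
            for sender in spot_types[spot]:
                for receiver in spot_types[nb]:
                    counts[sender][receiver] += 1
    return counts

cell_types = ["Fibroblast", "Melanocyte", "KC"]
# Cell types considered present in each spot (e.g. proportion above a cutoff)
spot_types = {"s1": ["Fibroblast"], "s2": ["Melanocyte", "KC"], "s3": ["KC"]}
neighbours = {"s1": ["s2"], "s2": ["s1", "s3"], "s3": ["s2"]}

cci = count_cci(cell_types, spot_types, neighbours,
                ligand_spots={"s1"}, receptor_spots={"s2"}, sig_spots={"s1", "s2"})
for sender, row in cci.items():
    print(sender, row)
```

Because spot `s2` contains two cell types, a single ligand-receptor neighbour pair contributes to two entries of the matrix, which is how mixed spots are accommodated in this sketch.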
We run the CCI analysis and save the results for plotting.
# Running the counting of co-occurrence of cell types and LR expression hotspots #
st.tl.cci.run_cci(visium, 'cell_type', # Spot cell information either in data.obs or data.uns
min_spots=3, # Minimum number of spots for LR to be tested.
spot_mixtures=True, # If True will use the label transfer scores,
# so spots can have multiple cell types if score>cell_prop_cutoff
cell_prop_cutoff=0.2, # Spot considered to have cell type if score>0.2
sig_spots=True, # Only consider neighbourhoods of spots which had significant LR scores.
n_perms=100, # Permutations of cell information to get background, recommend ~1000,
n_cpus=4 # Number of CPUs for parallel. If None, detects & use all available.
)
st.pl.cci_check(visium, 'cell_type')
visium.write_h5ad(f"{data_dir}/visium_stlearn.h5ad")
Inference of cell-cell communication network - Run¶
import scanpy as sc
from matplotlib import pyplot as plt
import stlearn as st
# Load the saved object for plotting
data_dir = "/working/joint_projects/P3903/teaching2024-winter-qimr/data/finalised_files_to_move"
visium = sc.read_h5ad(f"{data_dir}/visium_stlearn.h5ad")
Visualisation of a sample LR pair
best_lr = visium.uns['lr_summary'].index.values[0]
stats = ['lr_scores', 'p_vals', 'p_adjs', '-log10(p_adjs)']
fig, axes = plt.subplots(ncols=len(stats), figsize=(16,6))
for i, stat in enumerate(stats):
st.pl.lr_result_plot(visium, use_result=stat, use_lr=best_lr,
show_color_bar=False, ax=axes[i])
axes[i].set_title(f'{best_lr} {stat}')
Plot the cell-type labels
sc.pl.spatial(visium, color="cell_type")
Check interaction and cell type frequency correlation
st.pl.cci_check(visium, 'cell_type')
Chord plot of all interactions in a sample and for a specific LR pair
st.pl.lr_chord_plot(visium, 'cell_type')
st.pl.lr_chord_plot(visium, 'cell_type', best_lr)
MMCCI¶
Multimodal Cell-Cell Interaction Integration
MMCCI is a Python package that integrates cell-cell interaction results across multiple samples and multiple modalities. It also offers various downstream analyses to infer changes in cellular and molecular mechanisms due to altered cellular interactions in disease.
import pandas as pd # used below to load the saved GSEA results
import multimodal_cci as mmcci
In this cell we load saved cell-cell interaction results from stLearn.
remove_insignificant: Step 1 - we remove the interactions that are not significant.
calculate_overall_interactions: Step 2 - we calculate the significant cell-cell interactions for all LR-pairs.
data_dir = "/working/joint_projects/P3903/teaching2024-winter-qimr/data/finalised_files_to_move/"
# Load the data
visium = mmcci.tl.read_stLearn(f"{data_dir}/visium_stlearn.h5ad", return_adata=True)
visium['lr_scores'] = mmcci.it.remove_insignificant(visium['lr_scores'], visium['lr_pvals'])
overall = mmcci.it.calculate_overall_interactions(visium['lr_scores'])
mmcci.plt.network_plot(overall)
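Conceptually, the two steps above can be pictured with the plain-Python sketch below. It is not MMCCI's implementation: the data layout (one sender-by-receiver matrix per LR pair, stored as nested dicts), the 0.05 cutoff, the plain sum across LR pairs and the function names `mask_insignificant` and `sum_over_lr_pairs` are all assumptions, and MMCCI may weight or normalise differently.

```python
def mask_insignificant(scores, pvals, cutoff=0.05):
    """Set a sender->receiver score to 0 when its p-value is not below the cutoff."""
    return {lr: {sender: {receiver: (value if pvals[lr][sender][receiver] < cutoff else 0.0)
                          for receiver, value in row.items()}
                 for sender, row in matrix.items()}
            for lr, matrix in scores.items()}

def sum_over_lr_pairs(scores):
    """Add the per-LR matrices into one overall sender x receiver matrix."""
    total = {}
    for matrix in scores.values():
        for sender, row in matrix.items():
            for receiver, value in row.items():
                total.setdefault(sender, {}).setdefault(receiver, 0.0)
                total[sender][receiver] += value
    return total

# Toy scores and p-values for two hypothetical LR pairs and two cell types
scores = {"LIG1_REC1": {"A": {"A": 0.0, "B": 2.0}, "B": {"A": 1.0, "B": 0.0}},
          "LIG2_REC2": {"A": {"A": 0.0, "B": 3.0}, "B": {"A": 0.5, "B": 0.0}}}
pvals = {"LIG1_REC1": {"A": {"A": 1.0, "B": 0.01}, "B": {"A": 0.2, "B": 1.0}},
         "LIG2_REC2": {"A": {"A": 1.0, "B": 0.03}, "B": {"A": 0.04, "B": 1.0}}}

significant = mask_insignificant(scores, pvals)
overall_toy = sum_over_lr_pairs(significant)
print(overall_toy)
```

The B to A interaction through `LIG1_REC1` is dropped because its p-value is 0.2, so only the `LIG2_REC2` contribution remains in the overall B to A entry.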
lr_interaction_clustering: This function performs clustering using the LR-interaction scores per spot obtained from stLearn. We can visualise the spatial location of cell-cell interaction programs involving one or multiple cell-types.
mmcci.an.lr_interaction_clustering(visium['adata'], clustering="leiden")
lrs_per_celltype: This function returns the ordered/ranked LR pairs by proportion that are interacting between two cell-types.
mmcci.plt.lrs_per_celltype(visium['lr_scores'], sender="Imm_Fibroblast", receiver="Melanocytes", n=10)
mmcci.plt.lrs_per_celltype(visium['lr_scores'], sender="Melanocytes", receiver="KC_Differentiating", n=10)
mmcci.plt.lrs_per_celltype(visium['lr_scores'], sender="KC_Differentiating", receiver="Melanocytes", n=10)
run_gsea: Find enriched pathways for the LR pairs involved in the interactions.
results = mmcci.an.run_gsea(visium['lr_scores'], gene_sets=["KEGG_2021_Human", "MSigDB_Hallmark_2020"], show_plots=True)
# results.to_csv("/scratch/user/s4634945/Others/Training/results_gsea.csv")
results["Term"][10:30]
10 Rap1 signaling pathway
11 Malaria
12 MAPK signaling pathway
13 Breast cancer
14 Ras signaling pathway
15 Regulation of actin cytoskeleton
16 TGF-beta signaling pathway
17 Gastric cancer
18 Hematopoietic cell lineage
19 Signaling pathways regulating pluripotency of ...
20 Hippo signaling pathway
21 Complement and coagulation cascades
22 Hypertrophic cardiomyopathy
23 Wnt signaling pathway
24 Dilated cardiomyopathy
25 AGE-RAGE signaling pathway in diabetic complic...
26 Calcium signaling pathway
27 Amoebiasis
28 Chemokine signaling pathway
29 Fluid shear stress and atherosclerosis
Name: Term, dtype: object
results = pd.read_csv("/working/joint_projects/P3903/teaching2024-winter-qimr/data/results_gsea.csv",index_col=0)
grouped = mmcci.an.pathway_subset(visium['lr_scores'], results, ["Epithelial Mesenchymal Transition"], strict=True)
grouped_overall = mmcci.it.calculate_overall_interactions(grouped)
mmcci.plt.network_plot(grouped_overall, remove_unconnected=False)
mmcci.plt.lr_barplot(grouped, n=10)
ADVANCED¶
Other interesting CCI tools¶
- SpatialDM (https://github.com/StatBiomed/SpatialDM): A statistical model and toolbox to quickly identify the spatial co-expression of ligand-receptor pairs.
- CellPhoneDB (https://github.com/Teichlab/cellphonedb): Cell-type specific LR-interactions.
- NCEM (https://github.com/theislab/ncem): Deep learning based cell-cell interaction algorithm.
- MMCCI (https://www.biorxiv.org/content/10.1101/2024.02.28.582639v1.full): Integration of cell-cell interaction results across samples and modalities.